Practical Data Science by Mario Rojas
Author: Mario Rojas
Language: eng
Format: epub, pdf
Publisher: UNKNOWN
Published: 2021-01-14T00:00:00+00:00
Kingdom Animalia
Class Mammalia
Order Carnivora
Family Canidae
Genus Canis
Species lupus familiaris
He is classified as Canis lupus familiaris.
You must look at an object as revealed by recently discovered knowledge. I will now guide you through a Python code session that converts the flat file into a graph with knowledge. The only information you have is in the file Animals.csv in the directory ../VKHCG/03-Hillman/00-RawData.
The format is as follows:
ItemLevel  ParentID  ItemID  ItemName
0          0         50      Bacteria
0          0         202422  Plantae
1          50        956096  Negibacteria
1          50        956097  Posibacteria
The fields have the following meanings:
• ItemLevel is how far the specific item is from the top node in the classification.
• ParentID is the ItemID of the parent of the item listed.
• ItemID is the unique identifier for the item.
• ItemName is the full name of the item.
The data fits together as a considerable tree of classifications. You must create a graph that gives you the following:
Bacteria -> Negibacteria and Bacteria -> Posibacteria
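The parent-child relationships follow directly from the ParentID links. As a minimal sketch (not the book's code), the following resolves those links with a plain dictionary, checks that ItemLevel equals the number of hops to the top, and prints the path for one item. The four rows and the helper names depth and path_to_top are illustrative assumptions, as is the reading that a ParentID of 0 marks a top-level item.

```python
# Hypothetical in-memory copy of four sample rows from the flat file.
rows = [
    # (ItemLevel, ParentID, ItemID, ItemName)
    (0, 0, 50, "Bacteria"),
    (0, 0, 202422, "Plantae"),
    (1, 50, 956096, "Negibacteria"),
    (1, 50, 956097, "Posibacteria"),
]

# Index the rows by ItemID so a ParentID can be resolved to its row.
by_id = {item_id: (level, parent_id, name) for level, parent_id, item_id, name in rows}


def depth(item_id):
    """Count parent hops until a ParentID no longer matches any ItemID."""
    hops = 0
    parent_id = by_id[item_id][1]
    while parent_id in by_id:
        hops += 1
        parent_id = by_id[parent_id][1]
    return hops


def path_to_top(item_id):
    """Return the item names from the top-level ancestor down to the item."""
    names = []
    current = item_id
    while current in by_id:
        names.append(by_id[current][2])
        current = by_id[current][1]
    return list(reversed(names))


# Assumption: ItemLevel should equal the number of hops to the top node.
levels_match = all(depth(item_id) == level for level, _, item_id, _ in rows)
print(levels_match)
print(" -> ".join(path_to_top(956096)))
```

A check like this is cheap to run on the raw file before building the graph, because a mismatch between ItemLevel and the hop count usually points to a missing parent row.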
The following code transforms the flat file. You will perform a few sections of data preparation and data storage for the Retrieve and Assess supersteps, and then we will complete the Process step into the data vault. You start with the standard framework, so please transfer the code to your Python editor. First, let's set up the data:
################################################################
# -*- coding: utf-8 -*-
################################################################
import sys
import os
import pandas as pd
import networkx as nx
import sqlite3 as sq
import numpy as np
################################################################
if sys.platform == 'linux':
    Base=os.path.expanduser('~') + '/VKHCG'
else:
    Base='C:/VKHCG'
print('################################')
print('Working Base :',Base, ' using ', sys.platform)
print('################################')
################################################################
ReaderCode='SuperDataScientist'
Please replace the 'Practical Data Scientist' in the next line with your name.
ReaderName='Practical Data Scientist'
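One thing to note about the platform check in the setup code: it sends every non-Linux system to C:/VKHCG, including macOS, where sys.platform is 'darwin'. The following is a minimal sketch of a variant, assuming you want every non-Windows system to use the home directory. The helper choose_base is hypothetical and not part of the original code.

```python
import os
import sys


def choose_base(platform=sys.platform):
    """Return a base directory for the VKHCG tree (hypothetical helper)."""
    # sys.platform is 'win32' on Windows, 'linux' on Linux and
    # 'darwin' on macOS.
    if platform.startswith("win"):
        return "C:/VKHCG"
    return os.path.expanduser("~") + "/VKHCG"


print(choose_base("linux"))
print(choose_base("darwin"))
print(choose_base("win32"))
```

Passing the platform as a parameter keeps the function easy to test on any machine.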
You now set up the locations of all the deliverables of the code.
################################################################
Company='03-Hillman'
InputRawFileName='Animals.csv'
EDSRetrieveDir='01-Retrieve/01-EDS'
InputRetrieveDir=EDSRetrieveDir + '/02-Python'
InputRetrieveFileName='Retrieve_All_Animals.csv'
EDSAssessDir='02-Assess/01-EDS'
InputAssessDir=EDSAssessDir + '/02-Python'
InputAssessFileName='Assess_All_Animals.csv'
InputAssessGraphName='Assess_All_Animals.gml'
You now create the locations of all the deliverables of the code.
################################################################
sFileRetrieveDir=Base + '/' + Company + '/' + InputRetrieveDir
if not os.path.exists(sFileRetrieveDir):
    os.makedirs(sFileRetrieveDir)
################################################################
sFileAssessDir=Base + '/' + Company + '/' + InputAssessDir
if not os.path.exists(sFileAssessDir):
    os.makedirs(sFileAssessDir)
################################################################
sDataBaseDir=Base + '/' + Company + '/03-Process/SQLite'
if not os.path.exists(sDataBaseDir):
    os.makedirs(sDataBaseDir)
################################################################
sDatabaseName=sDataBaseDir + '/Hillman.db'
conn = sq.connect(sDatabaseName)
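The directory setup can also be written without the explicit existence checks. The following is a minimal sketch of that alternative, assuming a throwaway temporary directory in place of Base so that it can run anywhere. The variable names base, company, created, and db_path are illustrative.

```python
import os
import sqlite3 as sq
import tempfile

# Assumption: a throwaway temporary directory stands in for Base.
base = tempfile.mkdtemp()
company = "03-Hillman"

created = []
for sub_dir in ("01-Retrieve/01-EDS/02-Python",
                "02-Assess/01-EDS/02-Python",
                "03-Process/SQLite"):
    path = os.path.join(base, company, sub_dir)
    # exist_ok=True makes the call a no-op when the directory already exists.
    os.makedirs(path, exist_ok=True)
    os.makedirs(path, exist_ok=True)  # a second call does not raise
    created.append(path)

# The parent directory must exist before SQLite can create the database file.
db_path = os.path.join(created[-1], "Hillman.db")
conn = sq.connect(db_path)
conn.execute("CREATE TABLE IF NOT EXISTS Check_Table (x INT)")
conn.commit()
conn.close()
print(all(os.path.isdir(p) for p in created), os.path.isfile(db_path))
```

Using exist_ok=True avoids the small window between the existence check and the directory creation, which matters only when several processes run the setup at once.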
################################################################
# Raw to Retrieve
################################################################
You upload the CSV file with the flat structure.
sFileName=Base + '/' + Company + '/00-RawData/' + InputRawFileName
print('###########')
print('Loading :',sFileName)
AnimalRaw=pd.read_csv(sFileName,header=0,low_memory=False, encoding = "ISO-8859-1")
AnimalRetrieve=AnimalRaw.copy()
print(AnimalRetrieve.shape)
################################################################
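The load step passes encoding="ISO-8859-1" to read_csv. As a minimal sketch of why the encoding matters, the following decodes a few invented bytes with the standard library only. The sample row is an assumption and is not taken from Animals.csv.

```python
import csv
import io

# Hypothetical raw bytes: one row whose name contains an accented 'e'
# stored as the single byte 0xE9, as ISO-8859-1 (Latin-1) encodes it.
raw = b"ItemLevel,ParentID,ItemID,ItemName\n1,50,1,Caf\xe9\n"

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    # 0xE9 starts a multi-byte sequence in UTF-8, so decoding fails here.
    utf8_ok = False

# Decoding with the encoding the bytes were written in succeeds.
text = raw.decode("ISO-8859-1")
records = list(csv.DictReader(io.StringIO(text)))
print(utf8_ok, records[0]["ItemName"])
```

If the raw file were actually UTF-8, reading it as ISO-8859-1 would not fail but would silently garble accented names, so it is worth confirming the source encoding.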
You store the Retrieve steps data now.
sFileName=sFileRetrieveDir + '/' + InputRetrieveFileName
print('###########')
print('Storing Retrieve :',sFileName)
AnimalRetrieve.to_csv(sFileName, index = False)
You store the Assess steps data now.
################################################################
# Retrieve to Assess
################################################################
AnimalGood1 = AnimalRetrieve.fillna('0', inplace=False)
# fillna('0') inserts the string '0', so compare against '0' (not the integer 0)
# to drop rows without a name; .copy() avoids a SettingWithCopyWarning below.
AnimalGood2=AnimalGood1[AnimalGood1.ItemName!='0'].copy()
AnimalGood2[['ItemID','ParentID']]=AnimalGood2[['ItemID','ParentID']].astype(np.int32)
AnimalAssess=AnimalGood2
print(AnimalAssess.shape)
################################################################
sFileName=sFileAssessDir + '/' + InputAssessFileName
print('###########')
print('Storing Assess :',sFileName)
AnimalAssess.to_csv(sFileName, index = False)
################################################################
print('################')
sTable='All_Animals'
print('Storing :',sDatabaseName,' Table:',sTable)
AnimalAssess.to_sql(sTable, conn, if_exists="replace")
print('################')
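One detail of the Assess step is worth checking: fillna('0') inserts the string '0', so a filter on ItemName has to compare against '0' rather than the integer 0 to remove unnamed rows. The following minimal sketch shows the difference on a hypothetical two-row frame; the data is invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing ItemName.
df = pd.DataFrame({
    "ItemID": [50, 51],
    "ParentID": [0, 0],
    "ItemName": ["Bacteria", np.nan],
})

filled = df.fillna("0")

# The placeholder is the string '0', so an integer comparison removes nothing.
kept_int = filled[filled.ItemName != 0]
# Comparing against the string drops the row that had no name.
kept_str = filled[filled.ItemName != "0"].copy()

# .copy() gives an independent frame, so the column assignment below
# does not trigger pandas' SettingWithCopyWarning.
kept_str[["ItemID", "ParentID"]] = kept_str[["ItemID", "ParentID"]].astype(np.int32)
print(len(kept_int), len(kept_str))
```

Comparing the shape before and after the filter is a quick way to confirm that rows were actually removed.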
You start with the Process steps, to process the flat data into a graph. You can now extract the nodes, as follows:
################################################################
print('################')
sTable='All_Animals'
print('Loading Nodes :',sDatabaseName,' Table:',sTable)
sSQL=" SELECT DISTINCT"
sSQL=sSQL+ " CAST(ItemName AS VARCHAR(200)) AS NodeName,"
sSQL=sSQL+ " CAST(ItemLevel AS INT) AS NodeLevel"
sSQL=sSQL+ " FROM"
sSQL=sSQL+ " " + sTable + ";"
AnimalNodeData=pd.read_sql_query(sSQL, conn)
print(AnimalNodeData.shape)
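The node query can be tried in isolation. The following is a minimal sketch that runs the same SELECT DISTINCT against an in-memory SQLite table, using the standard sqlite3 module instead of pandas. The rows are hypothetical, and the ORDER BY clause is an addition that only makes the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE All_Animals (ItemLevel INT, ParentID INT, ItemID INT, ItemName TEXT)"
)
# Hypothetical rows; the last one repeats a name/level pair on purpose.
conn.executemany(
    "INSERT INTO All_Animals VALUES (?, ?, ?, ?)",
    [
        (0, 0, 50, "Bacteria"),
        (1, 50, 956096, "Negibacteria"),
        (1, 50, 956097, "Posibacteria"),
        (1, 50, 999999, "Posibacteria"),
    ],
)

# Same statement as the concatenated sSQL string, written as one literal.
node_sql = """
SELECT DISTINCT
    CAST(ItemName AS VARCHAR(200)) AS NodeName,
    CAST(ItemLevel AS INT) AS NodeLevel
FROM All_Animals
ORDER BY NodeLevel, NodeName;
"""
nodes = conn.execute(node_sql).fetchall()
conn.close()
print(nodes)
```

DISTINCT collapses the repeated name/level pair into a single node, which is why the node count can be smaller than the row count of the table.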
You have now successfully extracted the nodes. Well done. You can now extract the edges. You will start with the Process step, to convert the data into an appropriate graph structure.
################################################################
print('################')
sTable='All_Animals'
print('Loading Edges :',sDatabaseName,' Table:',sTable)
sSQL=" SELECT DISTINCT"
sSQL=sSQL+ " CAST(A1.ItemName AS VARCHAR(200)) AS Node1,"
sSQL=sSQL+ " CAST(A2.ItemName AS VARCHAR(200)) AS Node2"
sSQL=sSQL+ " FROM"
sSQL=sSQL+ " " + sTable + " AS A1"
sSQL=sSQL+ " JOIN"
sSQL=sSQL+ " " + sTable + " AS A2"
sSQL=sSQL+ " ON"
sSQL=sSQL+ " A1.ItemID=A2.ParentID;"
AnimalEdgeData=pd.read_sql_query(sSQL, conn)
print(AnimalEdgeData.shape)
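The edge query is a self-join: each row is matched with the rows that name it as their parent. The following is a minimal sketch of that join on four hypothetical rows in an in-memory SQLite table. The ORDER BY clause is an addition that only makes the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE All_Animals (ItemLevel INT, ParentID INT, ItemID INT, ItemName TEXT)"
)
conn.executemany(
    "INSERT INTO All_Animals VALUES (?, ?, ?, ?)",
    [
        (0, 0, 50, "Bacteria"),
        (0, 0, 202422, "Plantae"),
        (1, 50, 956096, "Negibacteria"),
        (1, 50, 956097, "Posibacteria"),
    ],
)

# A1 plays the parent role and A2 the child role.
edge_sql = """
SELECT DISTINCT
    CAST(A1.ItemName AS VARCHAR(200)) AS Node1,
    CAST(A2.ItemName AS VARCHAR(200)) AS Node2
FROM All_Animals AS A1
JOIN All_Animals AS A2
    ON A1.ItemID = A2.ParentID
ORDER BY Node1, Node2;
"""
edges = conn.execute(edge_sql).fetchall()
conn.close()
print(edges)
```

In this sample the top-level rows have ParentID 0 and no row has ItemID 0, so the inner join produces no edge into Bacteria or Plantae. Plantae has no children here either, so it does not appear in the edge list at all.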
You have now extracted the edges. So, let's build a graph.
################################################################
G=nx.Graph()
t=0
G.add_node('world', NodeName='World')
################################################################
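As a rough illustration of where this step is heading, the following minimal sketch builds a small networkx graph from hypothetical node and edge lists shaped like the two query results. It makes two assumptions that are not taken from the original code: it swaps in nx.DiGraph so that the parent-to-child direction is kept (the listing uses the undirected nx.Graph), and it attaches every level-0 item to the 'world' root.

```python
import networkx as nx

# Hypothetical node and edge lists, shaped like the query results
# (NodeName, NodeLevel) and (Node1, Node2).
nodes = [("Bacteria", 0), ("Plantae", 0), ("Negibacteria", 1), ("Posibacteria", 1)]
edges = [("Bacteria", "Negibacteria"), ("Bacteria", "Posibacteria")]

G = nx.DiGraph()  # assumption: directed, so parent -> child is preserved
G.add_node("world", NodeName="World")

for name, level in nodes:
    G.add_node(name, NodeName=name, NodeLevel=level)
    if level == 0:
        # Assumption: top-level items hang off the 'world' root.
        G.add_edge("world", name)

for parent, child in edges:
    G.add_edge(parent, child)

children = sorted(G.successors("Bacteria"))
print(G.number_of_nodes(), G.number_of_edges(), children)
```

With an undirected graph the same edges exist, but asking for the children of a node also returns its parent, so the choice depends on how the graph will be queried later.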
You add the nodes first.
GraphData=AnimalNodeData
print(GraphData)
################################################################
m=GraphData.shape[0]
for i in range(m):
    t+=1
    sNode0Name=str(GraphData['NodeName'][i]).strip()
    print('Node :',t,' of ',m,sNode0Name)
    sNode0=sNode0Name.
